Tagging and Linking Web Forum Posts

Authors

  • Su Nam Kim
  • Li Wang
  • Timothy Baldwin
Abstract

We propose a method for annotating post-to-post discourse structure in online user forum data, in the hope of improving troubleshooting-oriented information access. We introduce the tasks of: (1) post classification, based on a novel dialogue act tag set; and (2) link classification. We also introduce three feature sets (structural features, post context features and semantic features) and experiment with three discriminative learners (maximum entropy, SVM-HMM and CRF). We achieve above-baseline results for both dialogue act and link classification, with interesting divergences in which feature sets perform well over the two sub-tasks, and go on to perform a preliminary investigation of the interaction between post tagging and linking.
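As a rough illustration of the two tasks, the sketch below represents a thread as a list of posts, each carrying a dialogue act tag and a link to an earlier post, and derives a few structural features from it. Everything here is an assumption made for illustration: the `Post` class, the tag labels, the feature names and the example thread are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Post:
    index: int                     # position in the thread, 0 = initial post
    author: str
    text: str
    tag: Optional[str] = None      # dialogue act label (hypothetical label set)
    link_to: Optional[int] = None  # index of the earlier post this one responds to


def structural_features(thread: List[Post], i: int) -> Dict[str, object]:
    """Simple position- and author-based features for post i (illustrative only)."""
    post = thread[i]
    return {
        "position": i,
        "is_first": i == 0,
        "by_thread_initiator": post.author == thread[0].author,
        "n_tokens": len(post.text.split()),
    }


def link_distance(post: Post) -> Optional[int]:
    """How many posts back the linked post is, or None if the post has no link."""
    return None if post.link_to is None else post.index - post.link_to


# A made-up troubleshooting thread with made-up labels.
thread = [
    Post(0, "alice", "My laptop will not boot after the update.", "QUESTION", None),
    Post(1, "bob", "Try booting in safe mode.", "ANSWER", 0),
    Post(2, "alice", "That worked, thanks!", "CONFIRMATION", 1),
]
```

In this framing, post classification predicts `tag` for each post and link classification predicts `link_to`; feature dictionaries of this kind could be fed to any of the discriminative learners named in the abstract.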

Related resources

Unsupervised Induction of Part-of-Speech Information for OOV Words in German Internet Forum Posts

We show that the accuracy of part-of-speech (POS) tagging of German Internet forum posts can be improved substantially by exploiting distributional similarity information about out-of-vocabulary (OOV) words. Our best method increases the accuracy by +15.5% for OOV words compared to a standard tagger trained on newspaper texts, and by +12.7% if we use an already

From News to Comment: Resources and Benchmarks for Parsing the Language of Web 2.0

We investigate the problem of parsing the noisy language of social media. We evaluate four Wall-Street-Journal-trained statistical parsers (Berkeley, Brown, Malt and MST) on a new dataset containing 1,000 phrase structure trees for sentences from microblogs (tweets) and discussion forum posts. We compare the four parsers on their ability to produce Stanford dependencies for these Web 2.0 senten...

Extracting Adverse Drug Reactions from Forum Posts and Linking them to Drugs

Interest in medical data mining is growing rapidly as more health-related data becomes available online. We propose methods for extracting Adverse Drug Reactions (ADRs) from forum posts and linking extracted ADRs to the drugs that users claim are responsible for them. We evaluate our methodology using a corpus of annotated forum posts. We find that our ADR extraction method outperforms a strong ...

Overview of the SBS 2016 Mining Track

In this paper we present an overview of the mining track in the Social Book Search (SBS) lab 2016. The mining track addressed two tasks: (1) classifying forum posts as book search requests, and (2) linking book title mentions in forum posts to unique book IDs in a database. Both tasks are important steps in the process of solving complex search tasks within online reader communities. We prepare...

Personalizing Forum Search using Multidimensional Random Walks

Online forums are a vital resource for users to ask questions and to participate in discussions. Yet, the search functionality on such forum sites is very primitive; posts containing the searched keywords are retrieved in the order of their creation date. In these interactive and social web forum sites, users frequently make connections with other users due to shared interests, same information...

Publication date: 2010